Abstract In this paper we analyze several inexact fast augmented Lagrangian
methods for solving linearly constrained convex optimization problems. Our
methods rely mainly on the combination of the excessive-gap-like smoothing technique
developed in [15] and the newly introduced inexact oracle framework from [4].
We analyze several algorithmic instances with constant and adaptive smoothing
parameters and derive total computational complexity results in terms of
projections onto a simple primal set. For the basic inexact fast augmented
Lagrangian algorithm we obtain an overall computational complexity of order
$O(1/\epsilon^{5/4})$, while for the adaptive variant we get $O(1/\epsilon)$ projections onto a
primal set in order to obtain an $\epsilon$-optimal solution of our original problem.
dual fast gradient • excessive gap • overall computational complexity.

1 Introduction

Large-scale constrained convex optimization models are convenient tools for
formulating practical problems in modern engineering, statistics, and
economics. Several important applications in these fields can be
modeled within this class of problems, such as linear (distributed) model
predictive control [?], network utility maximization [3], or compressed
sensing [?]. However, solving large-scale optimization problems is still a
challenge in many applications due to the limitations of computational devices and
computer systems, as well as the limitations of existing algorithmic techniques.

In the recent optimization literature, primal-dual first-order methods
have gained considerable attention due to their low complexity per iteration and
their flexibility in handling linear operators, constraints, and nonsmoothness. In
the constrained case, when complicated constraints are present, many first-order
algorithms are combined with duality or penalty strategies in order to
handle them. Among the most popular primal-dual approaches, which behave
very well in practice, the augmented Lagrangian (AL) strategy has been extensively
studied, e.g., in [?]. It is well known that the AL smoothing
technique is a multi-stage strategy that requires the successive computation of
solutions of certain primal-dual subproblems. In general, these subproblems do
not have closed-form solutions, which makes AL strategies inherently
related to inexact first-order algorithms. In this setting, most of the complexity
results regarding AL and fast AL algorithms are given under inexact first-order
information [?]. To the best of our knowledge, the complexity estimates
for fast AL methods have been studied only in [?]. However, these
papers provide only outer complexity estimates, and outer
complexity estimates in AL methods do not take into account the complexity
per iteration. In many situations we can choose the smoothing parameters so as
to perform only one outer iteration, see, e.g., [?], but the overall
complexity estimate is still high due to the high complexity per iteration.
Trading off these quantities is a crucial question in the implementation of AL
methods.

In this paper we aim at improving the overall iteration complexity of fast
AL methods using the inexact oracle framework developed in [4], based on
a simple inner accuracy update and excessive-gap-like fast gradient
algorithms [15]. By using the inexact oracle framework, we are able to provide
the overall computational complexity of an inexact fast AL method with constant
smoothing parameter and of its adaptive variant.

Our contributions. In this paper we analyze the computational complexity
of the Inexact Fast Augmented Lagrangian (IFAL) method with constant and
adaptive smoothing parameters. Using the inexact oracle framework [4], our
approach allows us to obtain clean and intuitive complexity results and, moreover,
facilitates the derivation of the overall computational complexity of
our methods.

(a) For the basic IFAL method with constant smoothing parameter, we derive
$O(1/\epsilon^{1/2})$ outer complexity estimates corresponding to simple inner accuracy
updates. We also derive an overall computational complexity of order
$O(1/\epsilon^{5/4})$ projections onto a primal set, in order to obtain an $\epsilon$-optimal
solution in the sense of the objective residual and feasibility violation.

(b) Then, we show that for an optimal choice of the smoothing parameters we
need to perform only one outer iteration. Based on this result, we introduce
an adaptive IFAL method with variable smoothing parameters for which
we prove an overall computational complexity of order $O(1/\epsilon)$ projections.

(c) We show that our adaptive variant of the inexact fast AL method is
implementable, i.e., it is based on computable stopping criteria. Moreover, we
compare our results with other complexity estimates for AL methods from the
literature and highlight the advantageous features of our methods.

Paper organization. The rest of this paper is organized as follows. In Section 
2 we define our optimization model and introduce some preliminary concepts 
related to duality and inexact oracles. In Section 3 we introduce the inexact fast 
AL method with constant smoothing parameter and analyze its computational 
complexity. To improve this complexity, in Section 4 we study an adaptive 
parameter variant and provide its complexity estimate. Finally, in Section 5, 
we compare our results with other complexity results on AL methods from the 
literature. 

Notations. We work in the Euclidean space $\mathbb{R}^n$ composed of column vectors.
For $u, v \in \mathbb{R}^n$ we denote the inner product by $\langle u, v \rangle = u^T v$ and the Euclidean
norm by $\|u\| = \sqrt{\langle u, u \rangle}$. For any matrix $G$ we denote by $\|G\|$ its spectral norm.


2 Problem formulation and preliminaries 

We consider the following linearly constrained convex optimization problem:

\[
\begin{cases}
\min\limits_{u \in U} \; f(u) \\
\text{s.t.} \;\; Gu + g = 0,
\end{cases}
\tag{1}
\]

where $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is a proper, closed and convex function, $U \subseteq \mathbb{R}^n$
is a nonempty, closed and convex set, $G \in \mathbb{R}^{m \times n}$, and $g \in \mathbb{R}^m$. We use $U^*$
to denote the optimal solution set of (1), which is assumed to be
nonempty.

The goal of this paper is to analyze the optimization model (1) and to
develop new inexact augmented Lagrangian methods with convergence guarantees
for approximately solving (1). For this purpose, we require the following
blanket assumptions, which are assumed to be valid throughout the paper
without recalling them in the sequel:

Assumption 1 (i) The solution set $U^*$ is nonempty. The feasible set $U$ is
bounded and simple (i.e., the projection onto $U$ can be computed in closed
form or in polynomial time).

(ii) There exists a bounded optimal Lagrange multiplier $x^* \in \mathbb{R}^m$.

(iii) The objective function $f$ has Lipschitz continuous gradient with Lipschitz
constant $L_f > 0$, i.e.:

\[
\|\nabla f(u) - \nabla f(v)\| \le L_f \|u - v\| \quad \forall u, v \in \mathbb{R}^n.
\]
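To make the notion of a simple set in Assumption 1(i) concrete, the following minimal Python sketch shows two standard sets whose Euclidean projection is available in closed form. The function names `project_box` and `project_ball` are illustrative and do not come from the paper.

```python
import numpy as np

def project_box(u, lower, upper):
    """Euclidean projection onto the box {u : lower <= u <= upper}."""
    return np.clip(u, lower, upper)

def project_ball(u, radius):
    """Euclidean projection onto the ball {u : ||u|| <= radius}."""
    norm = np.linalg.norm(u)
    if norm <= radius:
        return u.copy()
    return (radius / norm) * u
```

Both sets are bounded, so they also satisfy the boundedness requirement on $U$; the cost of one projection is linear in the dimension.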
If $f$ is strongly convex, it is well known that the Lagrangian dual function
associated with the linear constraint $Gu + g = 0$ has Lipschitz continuous gradient.
This setting has been extensively studied in the literature (see, e.g., [?]).
Thus, in the rest of our paper we only assume $f$ to be a convex function
and not necessarily strongly convex. Due to the constraint, it is clear that
the dual function $d$ defined by $d(x) = \min_{u \in U} \{ f(u) + \langle x, Gu + g \rangle \}$ is, in general,
nonsmooth and concave, which makes it difficult to apply the
usual first-order methods to the dual. Therefore, various (dual) subgradient
schemes have been developed, with iteration complexities of order $O(1/\epsilon^2)$ [16].

Under additional mild assumptions, we aim in this paper at improving the
iteration complexity required for solving the linearly constrained optimization
problem (1). Our approach relies on the combination of smoothing
techniques and duality [?].

First, we briefly recall the Lagrangian duality framework. The
Lagrangian function and the dual function associated with the convex problem
(1) are defined by:

\[
\mathcal{L}(u, x) = f(u) + \langle x, Gu + g \rangle \quad \text{and} \quad d(x) = \min_{u \in U} \mathcal{L}(u, x).
\]

From Assumption 1(ii) it follows that the convex problem (1) is equivalent
to solving the dual formulation, i.e.:

\[
f^* = \max_{x \in \mathbb{R}^m} d(x) \quad \left( = \max_{x \in \mathbb{R}^m} \min_{u \in U} \mathcal{L}(u, x) \right). \tag{2}
\]

Our goal in this paper is to find an approximate primal solution of the
optimization problem (1). Therefore, we introduce the following definition:

Definition 1 Given a desired accuracy $\epsilon > 0$, the point $u_\epsilon \in U$ is called an
$\epsilon$-optimal solution of the primal problem (1) if it satisfies:

\[
f(u_\epsilon) - f^* \le \epsilon \quad \text{and} \quad \|Gu_\epsilon + g\| \le \epsilon.
\]

This set of optimality criteria has also been adopted by Rockafellar in [?] in
the context of classical augmented Lagrangian methods. Moreover, the above
criteria have also been used by Nesterov in [?] for the analysis of primal-dual
subgradient methods. It can be easily observed that once we have an $\epsilon$-feasible
point, i.e., $\|Gu_\epsilon + g\| \le \epsilon$, we also obtain a lower bound on $f(u_\epsilon) - f^*$.
Indeed, we have the relation:

\[
f^* = \min_{u \in U} \{ f(u) + \langle x^*, Gu + g \rangle \} \le f(u_\epsilon) + \|x^*\| \|Gu_\epsilon + g\|,
\]

and thus $f(u_\epsilon) - f^* \ge -\|x^*\| \|Gu_\epsilon + g\|$. Moreover, from a practical point of
view, it is sufficient to find an $\epsilon$-feasible point $u_\epsilon$ satisfying $f(u_\epsilon) - f^* \le \epsilon$.
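The two criteria of Definition 1 and the lower bound on the objective residual can be illustrated on a small instance. The sketch below uses a hypothetical toy problem that is not taken from the paper (minimize $\frac{1}{2}\|u\|^2$ subject to $u_1 + u_2 - 1 = 0$ over the box $[0,1]^2$), for which $u^* = (0.5, 0.5)$, $f^* = 0.25$ and $x^* = -0.5$ follow directly from the optimality conditions.

```python
import numpy as np

# Hypothetical toy instance (not from the paper):
# minimize 0.5*||u||^2  s.t.  u1 + u2 - 1 = 0,  u in [0, 1]^2.
G = np.array([[1.0, 1.0]])
g = np.array([-1.0])
F_STAR = 0.25                 # optimal value, attained at u* = (0.5, 0.5)
X_STAR = np.array([-0.5])     # optimal multiplier: u* + G^T x* = 0

def f(u):
    return 0.5 * float(u @ u)

def optimality_measures(u):
    """Return (objective residual, feasibility violation) from Definition 1."""
    return f(u) - F_STAR, float(np.linalg.norm(G @ u + g))

def is_eps_optimal(u, eps):
    obj_res, feas = optimality_measures(u)
    return obj_res <= eps and feas <= eps

u_eps = np.array([0.49, 0.50])
obj_res, feas = optimality_measures(u_eps)
# Lower bound f(u) - f* >= -||x*|| * ||Gu + g|| for this slightly infeasible point.
lower_bound = -float(np.linalg.norm(X_STAR)) * feas
```

Here the objective residual is slightly negative because `u_eps` is infeasible, which is exactly the situation covered by the lower bound.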

Given the max-min formulation (2), one can intuitively consider a double
smoothing of the convex-concave Lagrangian function. Regarding the dual
function, one of the most widely known smoothing strategies for obtaining an
approximate smooth dual function with Lipschitz continuous gradient is the
augmented Lagrangian (AL) smoothing [?]. Thus, we combine the
AL technique with a smooth approximation of the primal function and define:

\[
\mathcal{L}_{\mu\rho}(u, x) = f(u) + \langle x, Gu + g \rangle + \frac{\rho}{2}\|Gu + g\|^2 - \frac{\mu}{2}\|x\|^2, \tag{3}
\]

where $\mu, \rho > 0$ are two smoothness parameters. Clearly, we have the relation
$\lim_{\mu, \rho \to 0} \mathcal{L}_{\mu\rho}(u, x) = \mathcal{L}(u, x)$ and, in addition, $\mathcal{L}_{\mu\rho}(\cdot, x)$ has Lipschitz continuous
gradient with Lipschitz constant $L_{\mathcal{L}} = L_f + \rho\|G\|^2$ for any fixed $x$. Based on
this approximation of the true Lagrangian function we also define two smooth
approximations of the primal and dual functions $f$ and $d$, respectively:

\[
d_\rho(x) = \min_{u \in U} \mathcal{L}_{0\rho}(u, x) \quad \left( = \min_{u \in U} f(u) + \langle x, Gu + g \rangle + \frac{\rho}{2}\|Gu + g\|^2 \right),
\]

\[
f_\mu(u) = \max_{x \in \mathbb{R}^m} \mathcal{L}_{\mu 0}(u, x) \quad \left( = f(u) + \frac{1}{2\mu}\|Gu + g\|^2 \right).
\]

Let us define the optimal solutions of the two previous optimization problems:

\[
u_\rho(x) = \arg\min_{u \in U} \mathcal{L}_{0\rho}(u, x) \quad \text{and} \quad x_\mu(u) = \arg\max_{x \in \mathbb{R}^m} \mathcal{L}_{\mu 0}(u, x) \quad \left( = \frac{1}{\mu}(Gu + g) \right).
\]

Clearly, both functions $f_\mu$ and $d_\rho$ are smooth approximations of $f$ and $d$,
respectively. In particular, we observe that the smoothed primal function $f_\mu$
has Lipschitz continuous gradient with Lipschitz constant $L_{f_\mu} = L_f + \frac{1}{\mu}\|G\|^2$.
Moreover, the smoothed dual function $d_\rho$ is concave and its gradient
$\nabla d_\rho(x) = Gu_\rho(x) + g$ is Lipschitz continuous with Lipschitz constant
$L_{d_\rho} = \frac{1}{\rho}$. We emphasize again that, in most practical cases, $u_\rho(x)$ cannot
be computed exactly, but only within a pre-specified accuracy, which leads us to
consider the inexact oracle framework introduced in [4] for the analysis of
inexact first-order algorithms.
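The closed forms of the primal smoothing can be checked numerically. The following is a minimal sketch, with illustrative names (`smoothed_primal`, `lagrangian_mu0`) that are not from the paper: it builds $f_\mu$ and $x_\mu$ from $f$, $G$, $g$ and evaluates $\mathcal{L}_{\mu 0}(u, x)$ so that the identity $f_\mu(u) = \mathcal{L}_{\mu 0}(u, x_\mu(u))$ can be compared against other choices of $x$.

```python
import numpy as np

def smoothed_primal(f, G, g, mu):
    """Return callables (f_mu, x_mu) for the primal smoothing with parameter mu > 0."""
    def x_mu(u):
        # Maximizer of x -> <x, Gu + g> - (mu/2)||x||^2.
        return (G @ u + g) / mu

    def f_mu(u):
        r = G @ u + g
        return f(u) + float(r @ r) / (2.0 * mu)   # f(u) + ||Gu + g||^2 / (2 mu)

    return f_mu, x_mu

def lagrangian_mu0(f, G, g, mu, u, x):
    """L_{mu,0}(u, x) = f(u) + <x, Gu + g> - (mu/2)||x||^2."""
    return f(u) + float(x @ (G @ u + g)) - 0.5 * mu * float(x @ x)
```

For any fixed $u$, evaluating `lagrangian_mu0` at `x_mu(u)` reproduces `f_mu(u)`, while any other $x$ gives a smaller value, since the function is a concave quadratic in $x$.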

Recall that a smooth function $\phi : Q \to \mathbb{R}$ is equipped with a first-order
$(\delta, L)$-oracle if for any $y \in Q$ we can compute $(\phi_{\delta,L}(y), \nabla\phi_{\delta,L}(y)) \in \mathbb{R} \times \mathbb{R}^n$
such that the following bounds hold on $\phi$ (the so-called inexact descent lemma) [4]:

\[
0 \le \phi(x) - \left( \phi_{\delta,L}(y) + \langle \nabla\phi_{\delta,L}(y), x - y \rangle \right) \le \frac{L}{2}\|x - y\|^2 + \delta \quad \forall x, y \in Q. \tag{4}
\]

If we define $\tilde{u}_\rho(x) \in U$ as an inexact solution of the inner subproblem in
$u$ satisfying:

\[
0 \le \mathcal{L}_{0\rho}(\tilde{u}_\rho(x), x) - d_\rho(x) \le \delta, \tag{5}
\]

then, using the notation $\nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(x), x) = G\tilde{u}_\rho(x) + g$, we are able to provide
the following important auxiliary result, whose proof can be found in [?]:

Lemma 1 ([?]) Let $\delta > 0$ and $\tilde{u}_\rho(x) \in U$ satisfy (5). Then, for all $x, y \in \mathbb{R}^m$, we
have:

\[
0 \le \mathcal{L}_{0\rho}(\tilde{u}_\rho(y), y) + \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(y), y), x - y \rangle - d_\rho(x) \le L_{d_\rho}\|x - y\|^2 + 2\delta. \tag{6}
\]

The relation (6) implies that $d_\rho$ is equipped with a $(2\delta, 2L_{d_\rho})$-oracle with
$\phi_{\delta,L}(x) = \mathcal{L}_{0\rho}(\tilde{u}_\rho(x), x)$ and $\nabla\phi_{\delta,L}(x) = \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(x), x) = G\tilde{u}_\rho(x) + g$. It is
important to note that the analysis considered in [12, 18, 19] requires solving
the inner problem with higher accuracy, of order $O(\delta^2)$, i.e.:

\[
\mathcal{L}_{0\rho}(\tilde{u}_\rho(x), x) - d_\rho(x) \le O(\delta^2)
\]

in order to ensure bounds on $d_\rho(x)$ of the form:

\[
0 \le \mathcal{L}_{0\rho}(\tilde{u}_\rho(y), y) + \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(y), y), x - y \rangle - d_\rho(x)
\le \frac{L_{d_\rho}}{2}\|x - y\|^2 + \left( 1 + \sqrt{2L_{\mathcal{L}}}\, D_U \right)\delta,
\]

where $D_U$ is the diameter of the bounded convex domain $U$. Our approach in
this paper is therefore less conservative, since it requires solving the inner
problem with lower accuracy than in [12, 18, 19]. As we will see in the sequel,
this also has an impact on the total complexity of our method compared
to those in the previous papers.
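The two-sided bound (6) can be checked numerically on an instance where $d_\rho$ is available in closed form. The sketch below is an illustration under stated assumptions, not part of the paper: it reuses a hypothetical toy problem ($f(u) = \frac{1}{2}\|u\|^2$, $U = [0,1]^2$, $G = [1 \; 1]$, $g = -1$), models the inexact inner solution $\tilde{u}_\rho(y)$ by perturbing the exact minimizer, and measures the inner accuracy $\delta$ actually achieved.

```python
import numpy as np

# Hypothetical toy instance (not from the paper); the inner problem has a
# closed-form solution, so d_rho can be evaluated exactly.
G = np.array([[1.0, 1.0]])
g = np.array([-1.0])
rho = 2.0

def lagr(u, x):
    """Augmented Lagrangian L_{0,rho}(u, x)."""
    r = G @ u + g
    return 0.5 * float(u @ u) + float(x @ r) + 0.5 * rho * float(r @ r)

def u_exact(x):
    """Exact inner minimizer; by symmetry both coordinates coincide."""
    t = np.clip((rho - x[0]) / (1.0 + 2.0 * rho), 0.0, 1.0)
    return np.array([t, t])

def d_rho(x):
    return lagr(u_exact(x), x)

def inexact_oracle(y, perturbation):
    """Return (value, gradient, delta) built from a perturbed inner solution."""
    u_tilde = np.clip(u_exact(y) + perturbation, 0.0, 1.0)
    delta = lagr(u_tilde, y) - d_rho(y)       # inner accuracy actually achieved
    return lagr(u_tilde, y), G @ u_tilde + g, delta

def oracle_gap(x, y, perturbation):
    """Middle term of (6) and its upper bound L_d*||x - y||^2 + 2*delta, L_d = 1/rho."""
    value, grad, delta = inexact_oracle(y, perturbation)
    gap = value + float(grad @ (x - y)) - d_rho(x)
    bound = float((x - y) @ (x - y)) / rho + 2.0 * delta
    return gap, bound
```

For any pair $(x, y)$ and any perturbation, `oracle_gap` should return a nonnegative gap that does not exceed the bound, which is the content of Lemma 1 for this instance.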


3 Inexact fast augmented Lagrangian method 

In this section we propose an augmented Lagrangian smoothing strategy
similar to the excessive gap technique introduced in [?]. Typical excessive
gap strategies are based on primal-dual fast gradient methods, which
maintain at each iteration some excessive gap inequality. Using this inequality, the
convergence of the outer loop of the algorithm is naturally determined.

In this paper, we use an excessive-gap-like inequality, which holds at each
outer iteration of our algorithm. Given the dual smoothing parameter $\rho$, the inner
accuracy $\delta$ and the outer accuracy $\epsilon$, we further develop an Inexact Fast
Augmented Lagrangian (IFAL) algorithm for solving (1):

IFAL($\rho$, $\epsilon$) Algorithm

Initialization: Give $u^0 \in U$, $x^0 \in \mathbb{R}^m$ and $\mu_0 > 0$.

Iterations: For $k = 0, 1, \dots$, perform the following steps:

1. $\hat{x}^k = (1 - \tau_k) x^k + \frac{\tau_k}{\mu_k}(Gu^k + g)$

2. Find $\tilde{u}_\rho(\hat{x}^k)$ such that $\mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) - d_\rho(\hat{x}^k) \le \delta_k$

3. $u^{k+1} = (1 - \tau_k) u^k + \tau_k \tilde{u}_\rho(\hat{x}^k)$

4. $x^{k+1} = \hat{x}^k + \rho (G\tilde{u}_\rho(\hat{x}^k) + g)$

5. Set $\mu_{k+1} = (1 - \tau_k)\mu_k$

6. If a stopping criterion holds then STOP and return $(u^k, x^k)$.

End
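The steps of the IFAL scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the inner problem is solved by a standard accelerated projected gradient loop, the inner accuracy $\delta_k$ is enforced only indirectly through an iteration budget of the form $2 D_U (L_{\mathcal{L}}/\delta_k)^{1/2}$, and the choices $\mu_0 = 4/\rho$, $\tau_k = 2/(k+3)$, $\delta_k = \epsilon/(2(k+3))$ are hard-coded. The names `inner_solve` and `ifal` and all argument names are hypothetical.

```python
import numpy as np

def inner_solve(grad_f, G, g, project, x_hat, rho, L_in, u_start, n_iter):
    """Accelerated projected gradient on u -> L_{0,rho}(u, x_hat) over U."""
    u, v, t = u_start.copy(), u_start.copy(), 1.0
    for _ in range(n_iter):
        # Gradient of f(v) + <x_hat, Gv + g> + (rho/2)||Gv + g||^2 with respect to v.
        grad = grad_f(v) + G.T @ (x_hat + rho * (G @ v + g))
        u_next = project(v - grad / L_in)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        v = u_next + ((t - 1.0) / t_next) * (u_next - u)
        u, t = u_next, t_next
    return u

def ifal(grad_f, L_f, G, g, project, D_U, u0, x0, rho, eps, n_outer):
    """Sketch of IFAL(rho, eps) with mu_0 = 4/rho and tau_k = 2/(k+3)."""
    L_in = L_f + rho * np.linalg.norm(G, 2) ** 2   # L_f + rho*||G||^2
    u, x, mu = u0.copy(), x0.copy(), 4.0 / rho
    for k in range(n_outer):
        tau = 2.0 / (k + 3)
        delta = eps / (2.0 * (k + 3))
        x_hat = (1.0 - tau) * x + tau * (G @ u + g) / mu              # step 1
        # Assumption: delta is targeted through an iteration budget,
        # not through an explicit test on the inner optimality gap.
        n_inner = int(np.ceil(2.0 * D_U * np.sqrt(L_in / delta)))
        u_tilde = inner_solve(grad_f, G, g, project, x_hat, rho,
                              L_in, u, n_inner)                        # step 2
        u = (1.0 - tau) * u + tau * u_tilde                            # step 3
        x = x_hat + rho * (G @ u_tilde + g)                            # step 4
        mu = (1.0 - tau) * mu                                          # step 5
    return u, x
```

A fixed number of outer iterations replaces the stopping criterion of step 6 to keep the sketch short; in practice one would test $\|Gu^k + g\| \le \epsilon$ after each outer iteration.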


The update rules for $\tau_k$ are derived below. We define, as in [15, 18, 19],
the smoothed duality gap $\Delta_k = f_{\mu_k}(u^k) - d_\rho(x^k)$. Based on this smoothed
duality gap $\Delta_k$, we further provide a descent inequality, which will facilitate
the derivation of a simple inner accuracy update and of the total complexity
of the IFAL algorithm.
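Lemma 2 Let $\rho > 0$ and $\{(u^k, x^k)\}$ be the sequences generated by the IFAL
Algorithm. If the parameter $\tau_k \in (0, 1)$ satisfies $\frac{\tau_k^2}{1 - \tau_k} \le \mu_k\rho$ for all $k \ge 0$, then
the following excessive gap inequality holds:

\[
\Delta_{k+1} \le (1 - \tau_k)\Delta_k + 2\delta_k. \tag{7}
\]

Proof Firstly, from the strong convexity of $\|\cdot\|^2$ we obtain:

\[
\begin{aligned}
\Delta_k &= \max_{x \in \mathbb{R}^m} \left\{ f(u^k) + \langle x, Gu^k + g \rangle - \frac{\mu_k}{2}\|x\|^2 \right\} - d_\rho(x^k) \\
&\ge f(u^k) + \langle x, Gu^k + g \rangle - \frac{\mu_k}{2}\|x\|^2 + \frac{\mu_k}{2}\|x - x_{\mu_k}(u^k)\|^2 - d_\rho(x^k) \quad \forall x \in \mathbb{R}^m.
\end{aligned}
\tag{8}
\]

Secondly, given any $x \in \mathbb{R}^m$, from the definition of $\mathcal{L}_{0\rho}$ we have:

\[
\mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) + \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k), x - \hat{x}^k \rangle = f(\tilde{u}_\rho(\hat{x}^k)) + \langle x, G\tilde{u}_\rho(\hat{x}^k) + g \rangle + \frac{\rho}{2}\|G\tilde{u}_\rho(\hat{x}^k) + g\|^2. \tag{9}
\]

Multiplying both sides of (8) by $1 - \tau_k$ and combining with (9), we obtain:

\[
\begin{aligned}
(1 - \tau_k)\Delta_k \ge{} & (1 - \tau_k)\Big( f(u^k) + \langle x, Gu^k + g \rangle - \frac{\mu_k}{2}\|x\|^2 + \frac{\mu_k}{2}\|x - x_{\mu_k}(u^k)\|^2 - d_\rho(x^k) \Big) \\
& + \tau_k \Big( f(\tilde{u}_\rho(\hat{x}^k)) + \langle x, G\tilde{u}_\rho(\hat{x}^k) + g \rangle + \frac{\rho}{2}\|G\tilde{u}_\rho(\hat{x}^k) + g\|^2 \\
& \qquad - \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) - \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k), x - \hat{x}^k \rangle \Big).
\end{aligned}
\]

Using the convexity of $f$ we further have:

\[
\begin{aligned}
(1 - \tau_k)\Delta_k \ge{} & f\big((1 - \tau_k)u^k + \tau_k \tilde{u}_\rho(\hat{x}^k)\big) + \big\langle x, G\big((1 - \tau_k)u^k + \tau_k \tilde{u}_\rho(\hat{x}^k)\big) + g \big\rangle \\
& - \frac{\mu_{k+1}}{2}\|x\|^2 - (1 - \tau_k) d_\rho(x^k) + \frac{\mu_{k+1}}{2}\|x - x_{\mu_k}(u^k)\|^2 \\
& - \tau_k \Big( \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) + \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k), x - \hat{x}^k \rangle \Big).
\end{aligned}
\]

Further, from the left-hand side of the inexact descent lemma (6) we get:

\[
\begin{aligned}
(1 - \tau_k)\Delta_k \ge{} & f(u^{k+1}) + \langle x, Gu^{k+1} + g \rangle - \frac{\mu_{k+1}}{2}\|x\|^2 - \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) \\
& - \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k), (1 - \tau_k)x^k + \tau_k x - \hat{x}^k \rangle + \frac{\mu_{k+1}}{2}\|x - x_{\mu_k}(u^k)\|^2 \\
\ge{} & f(u^{k+1}) + \langle x, Gu^{k+1} + g \rangle - \frac{\mu_{k+1}}{2}\|x\|^2 - \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) \\
& - \max_{z = (1 - \tau_k)x^k + \tau_k x} \Big\{ \langle \nabla_x \mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k), z - \hat{x}^k \rangle - \frac{\mu_{k+1}}{2\tau_k^2}\|z - \hat{x}^k\|^2 \Big\}.
\end{aligned}
\]

Using the assumption that $\frac{\tau_k^2}{1 - \tau_k} \le \mu_k\rho$ and the right-hand side of the inexact
descent lemma (6), we further derive:

\[
(1 - \tau_k)\Delta_k \ge f(u^{k+1}) + \langle x, Gu^{k+1} + g \rangle - \frac{\mu_{k+1}}{2}\|x\|^2 - d_\rho(x^{k+1}) - 2\delta_k \quad \forall x \in \mathbb{R}^m.
\]

Choosing $x = \frac{1}{\mu_{k+1}}(Gu^{k+1} + g)$ we obtain our result. □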













Patrascu, Necoara, Tran-Dinh 


Similar excessive gap inequalities have been proved in [15, 18, 19] and then
used for analyzing the convergence rate of excessive gap algorithms. However,
the main results in [15, 18, 19] are only concerned with the outer iteration complexity
and do not take into account the inner computational effort necessary for
finding $\tilde{u}_\rho(\cdot)$, since it is very difficult to estimate this quantity using the approaches
presented in these papers.

In the sequel, based on our approach, we provide the total computational
complexity of the IFAL algorithm (including inner complexity) for attaining an
$\epsilon$-optimal solution of problem (1). First, we notice that if we assume $\mu_0 = \frac{4}{\rho}$,
then, taking into account that $\mu_k = \mu_0 \prod_{j=0}^{k-1}(1 - \tau_j)$, a simple choice of the sequence
$\tau_k$ satisfying $\frac{\tau_k^2}{1 - \tau_k} \le \mu_k\rho$ is given by $\tau_k = \frac{2}{k+3}$.

Theorem 2 Let $\rho, \epsilon > 0$, $\mu_0 = \frac{4}{\rho}$ and $\tau_k = \frac{2}{k+3}$. Let $\{(u^k, x^k)\}$ be the
sequences generated by the IFAL($\rho$, $\epsilon$) Algorithm. If we choose the inner
accuracy $\delta_k = \frac{\epsilon}{2(k+3)}$, then the following estimates on the objective residual and the
feasibility violation hold:

\[
f(u^k) - f^* \le \frac{\Delta_0}{(k+1)(k+2)} + \frac{\epsilon}{2},
\qquad
\|Gu^k + g\| \le \frac{\mu_0\|x^*\| + (2\mu_0\Delta_0)^{1/2}}{(k+1)(k+2)} + \frac{(8\epsilon)^{1/2}}{\rho^{1/2}(k+1)}.
\tag{10}
\]


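Proof From (7) it can be derived that:

\[
\begin{aligned}
\Delta_{k+1} &\le \Delta_0 \prod_{j=0}^{k}(1 - \tau_j) + 2\delta_k + 2\sum_{i=1}^{k} \delta_{k-i} \prod_{j=k-i+1}^{k}(1 - \tau_j) \\
&= \Delta_0 \prod_{j=0}^{k}(1 - \tau_j) + 2\prod_{j=0}^{k}(1 - \tau_j) \sum_{i=0}^{k} \frac{\delta_i}{\prod_{s=0}^{i}(1 - \tau_s)}.
\end{aligned}
\tag{11}
\]

Observing that $\prod_{j=0}^{k}(1 - \tau_j) = \frac{2}{(k+2)(k+3)}$, we can further bound the cumulative
error as follows:

\[
2\prod_{j=0}^{k}(1 - \tau_j) \sum_{i=0}^{k} \frac{\delta_i}{\prod_{s=0}^{i}(1 - \tau_s)}
= \frac{4}{(k+2)(k+3)} \sum_{i=0}^{k} \frac{(i+2)(i+3)\delta_i}{2}
= \frac{\epsilon}{2} \cdot \frac{(k+1)(k+4)}{(k+2)(k+3)} \le \frac{\epsilon}{2}.
\]

Using this bound and $f^* \ge d_\rho(x)$ for any $x \in \mathbb{R}^m$, from (11) we have:

\[
f(u^k) - f^* \le f_{\mu_k}(u^k) - d_\rho(x^k) = \Delta_k \le \frac{\Delta_0}{(k+1)(k+2)} + \frac{\epsilon}{2}, \tag{12}
\]

which is the first estimate in (10). On the other hand, using the KKT conditions of (1), we have:

\[
\begin{aligned}
f_{\mu_k}(u^k) - f^* &= f(u^k) + \frac{1}{2\mu_k}\|Gu^k + g\|^2 - f^* \\
&\ge \frac{1}{2\mu_k}\|Gu^k + g\|^2 + \langle \nabla f(u^*), u^k - u^* \rangle \\
&\ge \frac{1}{2\mu_k}\|Gu^k + g\|^2 - \|x^*\|\|Gu^k + g\|.
\end{aligned}
\tag{13}
\]

Combining (12) and (13), we obtain:

\[
\begin{aligned}
\|Gu^k + g\| &\le \mu_k\|x^*\| + \left[ \frac{2\mu_k\Delta_0}{(k+1)(k+2)} + 2\mu_k\epsilon \right]^{1/2} \\
&\le \frac{\mu_0\|x^*\| + (2\mu_0\Delta_0)^{1/2}}{(k+1)(k+2)} + \frac{(2\mu_0\epsilon)^{1/2}}{k+1},
\end{aligned}
\]

which is the second estimate of (10). □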
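The parameter choices used in Theorem 2 can be verified with exact rational arithmetic. The sketch below (the function name `check_parameters` is illustrative) checks, for the first iterations, the condition $\frac{\tau_k^2}{1-\tau_k} \le \mu_k\rho$ of Lemma 2 with $\mu_0 = 4/\rho$, the product formula $\prod_{j=0}^{k}(1-\tau_j) = \frac{2}{(k+2)(k+3)}$, and the cumulative error bound $\epsilon/2$ for $\delta_k = \frac{\epsilon}{2(k+3)}$.

```python
from fractions import Fraction

def tau(k):
    """Step parameter tau_k = 2/(k+3)."""
    return Fraction(2, k + 3)

def check_parameters(n_iters, rho=Fraction(1), eps=Fraction(1)):
    """Check the sequences of Theorem 2 exactly for k = 0, ..., n_iters - 1."""
    mu = Fraction(4) / rho            # mu_0 = 4/rho
    prod = Fraction(1)                # running product of (1 - tau_j)
    weighted = Fraction(0)            # sum_i delta_i / prod_{s<=i} (1 - tau_s)
    for k in range(n_iters):
        t = tau(k)
        assert t * t / (1 - t) <= mu * rho               # condition of Lemma 2
        prod *= 1 - t
        assert prod == Fraction(2, (k + 2) * (k + 3))    # product formula
        delta = eps / (2 * (k + 3))
        weighted += delta / prod
        assert 2 * prod * weighted <= eps / 2            # cumulative error bound
        mu *= 1 - t                                      # mu_{k+1} = (1 - tau_k) mu_k
    return True
```

Because `Fraction` arithmetic is exact, the equalities are tested without rounding error; the default values of `rho` and `eps` are arbitrary, since the checks scale out both parameters.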

Now, we provide the overall computational complexity of the IFAL Algorithm
in terms of the number of projections on the primal set $U$.

Theorem 3 Let $\rho, \epsilon > 0$, $\mu_0 = \frac{4}{\rho}$ and $\tau_k = \frac{2}{k+3}$. Assume that at each outer
iteration $k$ of the IFAL($\rho$, $\epsilon$) Algorithm, Nesterov's optimal method [14] is called
for computing an approximate solution $\tilde{u}_\rho(\hat{x}^k)$ of the inner subproblem (5)
such that $\mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) - d_\rho(\hat{x}^k) \le \delta_k \left( = \frac{\epsilon}{2(k+3)} \right)$. Then, for attaining an
$\epsilon$-optimal solution in the sense of Definition 1, the IFAL Algorithm performs
at most:

\[
\frac{16\gamma D_U \left[ 2(L_f + \rho\|G\|^2) \right]^{1/2}}{\epsilon^{5/4}}
\]

projections on the simple set $U$, where the constant $\gamma$ has the following expression:

\[
\gamma = \max\left\{ (2\Delta_0)^{1/2}, \left( \frac{32}{\rho} \right)^{1/2}, \left( \frac{8\|x^*\|}{\rho} + 2\sqrt{\frac{8\Delta_0}{\rho}} \right)^{1/2} \right\}.
\]
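Proof From Theorem 2 we observe that for attaining an $\epsilon$-optimal
solution, the number of outer iterations $N^{\text{out}}$ must satisfy:

\[
N^{\text{out}} \le \frac{1}{\epsilon^{1/2}} \max\left\{ (2\Delta_0)^{1/2}, (8\mu_0)^{1/2}, \left( 2\mu_0\|x^*\| + 2\sqrt{2\mu_0\Delta_0} \right)^{1/2} \right\} = \frac{\gamma}{\epsilon^{1/2}}.
\]

On the other hand, at each outer iteration $k$, Assumption 1(iii) implies that
Nesterov's optimal method [14] applied to the inner subproblem (5) performs:

\[
N_k^{\text{in}} = 2\left( \frac{L_{\mathcal{L}} D_U^2}{\delta_k} \right)^{1/2} = 2D_U \left[ \frac{2(L_f + \rho\|G\|^2)}{\epsilon} \right]^{1/2} (k+3)^{1/2} \tag{14}
\]

inner iterations (i.e., projections on $U$). Using this estimate we can easily derive
the total number of projections on the simple set $U$ necessary for attaining an
$\epsilon$-optimal point:

\[
\begin{aligned}
\sum_{k=0}^{N^{\text{out}}} N_k^{\text{in}} &= 2D_U \left[ \frac{2(L_f + \rho\|G\|^2)}{\epsilon} \right]^{1/2} \sum_{k=0}^{N^{\text{out}}} (k+3)^{1/2} \\
&\le 2D_U \left[ \frac{2(L_f + \rho\|G\|^2)}{\epsilon} \right]^{1/2} (N^{\text{out}} + 3)^{3/2} \\
&\le \frac{16\gamma D_U \left[ 2(L_f + \rho\|G\|^2) \right]^{1/2}}{\epsilon^{5/4}}.
\end{aligned}
\]

The theorem is proved. □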
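The counting argument in the proof can be reproduced numerically. The following sketch (all function names are illustrative) evaluates the inner estimate (14), sums it over the outer iterations, and compares the sum with the closed-form bound $2D_U[2(L_f + \rho\|G\|^2)/\epsilon]^{1/2}(N^{\text{out}} + 3)^{3/2}$; the estimates are kept as real numbers, without rounding up to integers.

```python
import math

def inner_iterations(k, L_f, rho, norm_G, D_U, eps):
    """Estimate (14): 2*D_U*sqrt(2*(L_f + rho*||G||^2)/eps) * sqrt(k + 3)."""
    scale = 2.0 * D_U * math.sqrt(2.0 * (L_f + rho * norm_G ** 2) / eps)
    return scale * math.sqrt(k + 3)

def total_projections(n_out, L_f, rho, norm_G, D_U, eps):
    """Sum of the inner estimates over the outer iterations k = 0, ..., n_out."""
    return sum(inner_iterations(k, L_f, rho, norm_G, D_U, eps)
               for k in range(n_out + 1))

def closed_form_bound(n_out, L_f, rho, norm_G, D_U, eps):
    """Upper bound 2*D_U*sqrt(2*(L_f + rho*||G||^2)/eps) * (n_out + 3)^(3/2)."""
    scale = 2.0 * D_U * math.sqrt(2.0 * (L_f + rho * norm_G ** 2) / eps)
    return scale * (n_out + 3) ** 1.5
```

The comparison makes the source of the $\epsilon^{-5/4}$ rate visible: the factor $\epsilon^{-1/2}$ comes from the inner accuracy, and $(N^{\text{out}})^{3/2} \sim \epsilon^{-3/4}$ from summing $(k+3)^{1/2}$ over $O(\epsilon^{-1/2})$ outer iterations.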


Remark 1 We observe that $\Delta_0 \le \bar{\Delta}_0$, where $\bar{\Delta}_0 = f_{\mu_0}(u^0) - \mathcal{L}_{0\rho}(\tilde{u}_\rho(x^0), x^0) + \delta_0$
can be computed explicitly. Using this upper bound in our estimates
makes Algorithm IFAL implementable, i.e., the algorithm stops with
an $\epsilon$-optimal solution at the outer iteration $N^{\text{out}}$ provided that the following
two computable conditions hold:

\[
\|Gu^k + g\| \le \epsilon \quad \text{and} \quad N^{\text{out}} \ge \left( \frac{2\bar{\Delta}_0}{\epsilon} \right)^{1/2}.
\]

Moreover, as suggested, e.g., in [18, 19], if we choose the primal-dual starting
points $u^0 = \tilde{u}_\rho(0)$ and $x^0 = \frac{1}{\mu_0}(Gu^0 + g)$, respectively, then we even have
$\Delta_0 \le \delta_0$.


4 Adaptive inexact fast augmented Lagrangian method 

In this section we analyze the overall iteration complexity of the IFAL
Algorithm for an optimal choice of the smoothing parameter $\rho$, and then we
introduce an adaptive variant with the same computational complexity
(up to a logarithmic factor) that is fully implementable in practice.

First, assume that we adopt the initialization strategy suggested in
Remark 1, such that $\Delta_0 \le \delta_0$. Using this strategy and the previous
assumption that $\delta_k = \frac{\epsilon}{2(k+3)}$, the outer complexity can be estimated as:

\[
N^{\text{out}} \le \frac{1}{\epsilon^{1/2}} \max\left\{ \left( \frac{\epsilon}{3} \right)^{1/2}, \left( \frac{32}{\rho} \right)^{1/2}, \left( \frac{8\|x^*\|}{\rho} + 2\sqrt{\frac{4\epsilon}{3\rho}} \right)^{1/2} \right\} =: \frac{\gamma}{\epsilon^{1/2}}. \tag{15}
\]

Note that varying the smoothing parameter $\rho$ induces a trade-off
between the number of outer iterations and the complexity of the inner
subproblem, i.e., for a sufficiently large $\rho$ we have a single outer iteration, but
a complex inner subproblem. The next result provides an optimal choice for $\rho$
(up to a constant factor), such that the best total complexity is obtained. For
simplicity of the exposition, we assume $\|x^*\| \ge 2$.
Theorem 4 Let $\rho, \epsilon > 0$, $\mu_0 = \frac{4}{\rho}$ and $\tau_k = \frac{2}{k+3}$. Assume that, at each outer
iteration $k$ of the IFAL($\rho$, $\epsilon$) Algorithm, Nesterov's optimal method [14] is called
for computing an approximate solution $\tilde{u}_\rho(\hat{x}^k)$ of the inner subproblem (5)
such that:

\[
\mathcal{L}_{0\rho}(\tilde{u}_\rho(\hat{x}^k), \hat{x}^k) - d_\rho(\hat{x}^k) \le \delta_k \left( = \frac{\epsilon}{2(k+3)} \right).
\]

Then, by choosing the smoothing parameter:

\[
\rho = \frac{16\|x^*\|^2}{\epsilon},
\]

for attaining an $\epsilon$-optimal solution of (1), the IFAL Algorithm performs at
most:

\[
\left( \frac{6L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{4\sqrt{6}\, D_U \|G\| \|x^*\|}{\epsilon}
\]

projections on the simple set $U$.


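Proof First, we observe that for $\rho \ge \frac{16\|x^*\|^2}{\epsilon}$ the IFAL Algorithm performs
a single outer iteration. Indeed, from (15) one can obtain $\gamma \le \epsilon^{1/2}$ provided
that:

\[
\rho \ge \frac{32}{\epsilon} \quad \text{and} \quad \frac{8\|x^*\|}{\rho} + 2\sqrt{\frac{4\epsilon}{3\rho}} \le \epsilon.
\]

It can be seen that any $\rho$ such that $\rho \ge \frac{16\|x^*\|^2}{\epsilon}$ satisfies both conditions. In
this case, when a single outer iteration is sufficient, the total computational
complexity is given by the inner complexity estimate (14):

\[
\left( \frac{6L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{4\sqrt{6}\, D_U \|G\| \|x^*\|}{\epsilon}.
\]

On the other hand, if $\rho < \frac{16\|x^*\|^2}{\epsilon}$, then from (15) we can further bound $N^{\text{out}}$
as follows:

\[
N^{\text{out}} \le \frac{2^{5/2}\|x^*\|}{(\rho\epsilon)^{1/2}}. \tag{16}
\]

Using the same inner complexity estimate (14) of Nesterov's optimal
method, the total computational complexity is given by:

\[
\begin{aligned}
\sum_{k=0}^{N^{\text{out}}+1} N_k^{\text{in}} &\le 2D_U \left[ \frac{2(L_f + \rho\|G\|^2)}{\epsilon} \right]^{1/2} (N^{\text{out}} + 4)^{3/2} \\
&\le 2(N^{\text{out}} + 4)^{3/2} \left( \frac{2L_f D_U^2}{\epsilon} \right)^{1/2} + 2D_U \|G\| (N^{\text{out}} + 4)^{3/2} \left( \frac{2\rho}{\epsilon} \right)^{1/2}.
\end{aligned}
\]

Using (16) for the second term on the right-hand side of the above estimate, and
optimizing over $\rho$, we obtain that, for the optimal parameter $\rho^* = \frac{2^{11/3}\|x^*\|^2}{\epsilon}$,
the necessary number of outer iterations is at most 2. In conclusion, the choice
$\rho = \frac{16\|x^*\|^2}{\epsilon}$ is optimal up to a constant factor. □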
If one knows a priori an upper bound on $\|x^*\|$, then the previous result
indicates that, for a proper choice of $\rho$, a single outer iteration of the
IFAL Algorithm is sufficient to attain $\epsilon$-primal optimality, and the overall
iteration complexity is of order $O(1/\epsilon)$, which is better than the $O(1/\epsilon^{5/4})$ obtained for an arbitrary
$\rho > 0$. However, in practice $\|x^*\|$ is unknown and thus the optimal smoothing
parameter $\rho$ cannot be computed. In order to cope with this problem, we further
provide an implementable adaptive variant of the IFAL Algorithm, which
has the same total complexity as given in Theorem 4 (up to a logarithmic
factor). The Adaptive Inexact Fast Augmented Lagrangian (A-IFAL) algorithm
relies on a search procedure that is typically used for penalty and augmented
Lagrangian methods when a bound on the optimal Lagrange
multipliers is unknown (see, e.g., [?]).

A-IFAL($\rho_0$, $\epsilon$) Algorithm

Initialization: Choose $\rho_0, \epsilon > 0$, $\mu_0 = \frac{4}{\rho_0}$ and $(u^0, x^0)$ such that $\Delta_0 \le \delta_0$.

Iterations: For $k = 1, 2, \dots$, perform:

1. Starting from $(u^{k-1}, x^{k-1})$, run a single iteration of the IFAL($\rho_k$, $\epsilon$)
Algorithm and obtain the output $(u^k, x^k)$.

2. If $\|Gu^k + g\| \le \epsilon$ then STOP; otherwise, update $k = k + 1$, $\rho_{k+1} = 2\rho_k$
and return to step 1.

End

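We notice that the A-IFAL Algorithm can be regarded as a variant of the IFAL
Algorithm with variable, increasing smoothness parameter $\rho$ and constant inner
accuracy $\delta = \frac{\epsilon}{6}$. The following result provides the overall complexity of the
A-IFAL Algorithm necessary for attaining an $\epsilon$-optimal solution.

Theorem 5 Let $\{(u^k, x^k)\}$ be the sequences generated by the A-IFAL
Algorithm. Then, for attaining an $\epsilon$-optimal solution of (1), the following total
number of projections on $U$ need to be performed:

\[
\log_2\left( \frac{16\|x^*\|^2}{\rho_0\epsilon} \right) \left( \frac{24L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{80\sqrt{3}\, D_U \|G\| \|x^*\|}{\epsilon}.
\]

Proof We observe that the maximum number of outer iterations performed by
the A-IFAL Algorithm is given by $N^{\text{out}}_{\max} = \log_2\left( \frac{16\|x^*\|^2}{\rho_0\epsilon} \right)$. Thus the overall
iteration complexity is given by:

\[
\begin{aligned}
\sum_{k=0}^{N^{\text{out}}_{\max}-1} N_k^{\text{in}}
&\le \sum_{k=0}^{N^{\text{out}}_{\max}-1} \left[ \left( \frac{24L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{2\sqrt{6}\, D_U \|G\| \rho_k^{1/2}}{\epsilon^{1/2}} \right] \\
&= N^{\text{out}}_{\max} \left( \frac{24L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{2\sqrt{6}\, D_U \|G\| \rho_0^{1/2}}{\epsilon^{1/2}} \cdot \frac{(\sqrt{2})^{N^{\text{out}}_{\max}} - 1}{\sqrt{2} - 1} \\
&\le \log_2\left( \frac{16\|x^*\|^2}{\rho_0\epsilon} \right) \left( \frac{24L_f D_U^2}{\epsilon} \right)^{1/2} + \frac{80\sqrt{3}\, D_U \|G\| \|x^*\|}{\epsilon}.
\end{aligned}
\]

□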
It is important to note that the adaptive A-IFAL algorithm has the same
computational complexity as the IFAL algorithm, up to a logarithmic factor.
Moreover, both algorithms are implementable, i.e., they can be stopped based
on verifiable stopping criteria and their parameters can be easily computed.


5 Comparison with other augmented Lagrangian complexity results 

In this section we compare the computational complexity and other features 
of the IFAL/A-IFAL Algorithm with previous works and complexity results 
on AL methods. 

Given $x \in U$ and $r > 0$, we use the notations $B_r(x) = \{ y \in \mathbb{R}^n \mid \|y - x\| \le r \}$ and
$N_U(x) = \{ y \in \mathbb{R}^n \mid \langle y, z - x \rangle \le 0 \;\; \forall z \in U \}$. In [7], the authors analyze
the classical AL method for the same class of problems (1). They developed an
implementable variant of the classical AL method to obtain an $\epsilon$-suboptimal
primal-dual pair $(u_\epsilon, x_\epsilon)$ satisfying the following criteria:

\[
\nabla f(u_\epsilon) + G^T x_\epsilon \in -N_U(u_\epsilon) + B_\epsilon(0) \quad \text{and} \quad \|Gu_\epsilon + g\| \le \epsilon. \tag{17}
\]

The authors provide their own iteration complexity analysis for the augmented
Lagrangian method using the inexact dual gradient algorithm; without any
artificial perturbation of the problem, they obtain that it is necessary to
perform $O(1/\epsilon^{7/4})$ projections on the simple set $U$ in order to obtain a primal-dual
pair satisfying (17). An important remark is that, for some $\rho > 0$, the method
in [7] requires a priori a pre-specified number of outer iterations in order
to compute the inner accuracy and to terminate the algorithm. Moreover, to
satisfy our $\epsilon$-optimality definition, an average primal iterate $\hat{u}^k = \frac{1}{k+1}\sum_{i=0}^{k} u^i$
has to be computed (see [12]). Further, for any fixed $\rho$, a prescribed number of
outer iterations has to be performed and the inner accuracy $\delta_k$ has to be chosen
accordingly. In this case, the method from [7] requires a total number of
projections on the set $U$ whose estimate (18) contains terms of order
$1/(\rho^2\epsilon^{5/2})$ and $1/(\rho^{3/2}\epsilon^{5/2})$, provided that $\rho$ is sufficiently small. However, we
observe that for $\rho = O(\|x^*\|/\epsilon)$, from (18) the overall iteration complexity
is of order $O(1/\epsilon)$ (as in the present paper), while for an arbitrary constant
parameter $\rho$, the complexity estimates are much worse than those given in this
paper.

It is important to note that although the inexact excessive gap methods
introduced in [?] are similar to the IFAL Algorithm and the authors also
provide outer complexity estimates of order $O(1/\epsilon^{1/2})$ for a fixed $\rho$, the update
rules for the inner accuracy in [?] induce difficulties in the derivation of
the total complexity. Moreover, assuming one implements the update rule for
the inner accuracy $\delta_k$ given, e.g., by [?, Theorem 5.1], at each outer iteration
a primal iterate $\tilde{u}_\rho(x)$ is required satisfying:



For a small constant $\rho$ and high accuracy, the theoretical complexity of the
algorithm can be very pessimistic. Moreover, from our previous analysis we
can conclude that for an adequate choice of the parameter $\rho$ the number of
outer iterations is 1 and therefore outer complexity estimates are irrelevant
for the total complexity of the method.

Other recent complexity results concerning the classical AL method are
given, e.g., in [?]. For example, in [2] an adaptive classical AL method
for cone constrained convex optimization models is analyzed. However, the
method in [2] is not entirely implementable (the stopping criteria cannot be
verified) and the inner accuracy is constrained to be of order:



where $\beta > 1$. The authors in [2] show that the outer complexity is of order
$O(\log(1/\epsilon))$ and thus the total complexity is similar to the estimates given
in our paper (up to a logarithmic factor). However, our IFAL Algorithm
represents an accelerated augmented Lagrangian method based on the excessive
gap theory and, moreover, it can be easily implemented in practice, i.e., it can
be stopped based on verifiable stopping criteria and the parameters can be
easily computed.

6 Concluding remarks 

We have analyzed the iteration complexity of several inexact accelerated
first-order augmented Lagrangian methods for solving linearly constrained convex
optimization problems. We have computed the optimal choice of the penalty
parameter $\rho$ and, by means of smoothing techniques and an excessive-gap-like
condition, we have provided estimates on the overall computational complexity of these
algorithms. We have compared our theoretical results with other existing results in
the literature. The implementation of these algorithms, numerical simulations,
and comparisons can be found in the forthcoming full paper.